Saturate EMA sum to PROD_W range to prevent arithmetic overflow on sustained inputs#2
Open
Abraxas3d wants to merge 1 commit into
Open
Saturate EMA sum to PROD_W range to prevent arithmetic overflow on sustained inputs#2Abraxas3d wants to merge 1 commit into
Abraxas3d wants to merge 1 commit into
Conversation
The combinational sum assignment summed two shift_left'd signed terms without saturation, which could wrap past +/- 2^(PROD_W-1) when sustained high-amplitude inputs drove the accumulator past the signed PROD_W-bit range. Once wrapped, the cascade output flipped negative and the EMA settled into a stable wrong-sign equilibrium with no recovery. Symptom in our use case (polyphase channelizer power detector, AMSAT-UK FunCube+ MDT-SIC): channel-0 EMA-2 sum oscillated between near +2^42 and near -2^42 every ~280us under sustained DC input, ending tests in a negative-pinned state that contaminated subsequent measurements. Fix: compute the sum in EXTRA_W=4 wider intermediate (sum_wide), then clamp to +/- (2^(PROD_W-1) - 1) before assigning to sum. Preserves bit-exact behavior for in-range values; clamps gracefully at the limits. Verified: full 1ms simulation across 64 EMA cascades shows MSB stays 0 throughout, no wrap events, all 9 testbench assertions pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The combinational
sumassignment inlowpass_ema.vhdsums twoshift_left'd signed terms without saturation. When sustained high-amplitude inputs drive the accumulator past+/- (2^(PROD_W-1) - 1), the result wraps. Once wrapped, the cascade output flips sign and the EMA settles into a stable wrong-sign equilibrium with no path to recovery. The slow time constant that gives the EMA its smoothing power also locks it into the wrong half-plane.This PR adds an
EXTRA_W = 4wider intermediate signal (sum_wide) for the combinational add, then clamps to+/- (2^(PROD_W-1) - 1)before assigning tosum. Behavior is bit-exact identical for in-range values. It clamps gracefully at the limits.Background
We're integrating
lowpass_emaas the inner stage of a cascaded power-detection EMA in the 64-channel polyphase channelizer for Haifuraiya. It will also be used for SIC-MDT, AMSAT-UK's FunCube+ mission.Our typical config (PD per channel):
ALPHA_W=18,PROD_W=43,SUM_SHIFT_W=18,MULT_DATA_SHIFT=0,MULT_SUM_SHIFT=1, with two EMAs in cascade. The outer (slow) EMA usesalpha2 = 2^-12for ~4096-frame time constant on power readings.Note: This PR is independent of and complementary to #1 (data_ena gating), which open sepparately. The two fixes address unrelated concerns and should not mess each other up.
The bug, demonstrated
During a testbench run with sustained DC input applied to a channel, we observed channel 0's reported power register reading back as
1,785,562,731(unsigned). Reinterpreted as signed 32-bit:-2,509,404,565this is super clearly wrong for a power magnitude.We probed
u_ema_2/sum(43-bit signed) on the affected channel at 10 µs intervals across the 1 ms test:The accumulator was actively oscillating through wraparound every ~280 µs under sustained DC. The test sampled
sumat a moment when MSB was 1, which is why downstream code read back a negative-interpreted-as-large-unsigned value.Root cause:
shift_left(mult_sum, MULT_SUM_SHIFT)plusshift_left(mult_data, MULT_DATA_SHIFT)exceeds the 43-bit signed range when both summerands are near their respective maxima. With sustained high-amplitude integration, this happens repeatedly. This sustained high-amplitude integration is a harsh use case, but it can happen.The fix
src/lowpass_ema.vhdadds saturation in the combinational sum computation:Synthesis cost: one comparison and a 2:1 mux per
sumbit. Negligible on Zynq UltraScale+ or iCE40UP5K (our two current targets).Verification
Same testbench, same input pattern, post-fix probe of
sumMSB:No wraps. No sign flips. The accumulator clamps cleanly at the upper limit when input pushes it past
+(2^(PROD_W-1) - 1), and the EMA continues to behave as a low-pass filter at the saturation boundary.End-to-end result: 9/9 testbench assertions pass across the full 64-channel polyphase channelizer with 64 instances of this
lowpass_emamodule in production cascade configuration.Notes for Matthew et al
EXTRA_W = 4is chosen conservatively to absorb any reasonable combination ofMULT_DATA_SHIFTandMULT_SUM_SHIFT(max 3 bits each in our config) plus the +1 bit for the addition. Maybe this should be parameterized as well?? Or leave it up to implementers to compute it from the existing shift parameters? Not sure what to do about this.The original
sum <= ...line is preserved as a comment immediately above the new code so we can see exactly what's being replaced without flipping between commits. It can be removed if this all looks reasonable and it's too much clutter.I have not changed any timing or interface
sumkeeps the same type, width, and combinational nature.